📚 node [[numerical_data|numerical data]]
Welcome! Nobody has contributed anything to 'numerical_data|numerical data' yet. You can:
  • Write something in the document below!
    • There is at least one public document in every node in the Agora. Whatever you write in it will be integrated and made available for the next visitor to read and edit.
  • Write to the Agora from social media.
    • If you follow Agora bot on a supported platform and include the wikilink [[numerical_data|numerical data]] in a post, the Agora will link it here and optionally integrate your writing.
  • Sign up as a full Agora user.
    • As a full user you will be able to contribute your personal notes and resources directly to this knowledge commons. Some setup required :)
⥅ related node [[numerical_data]]
⥅ node [[numerical_data]] pulled by Agora

numerical data

Go back to the [[AI Glossary]]

Features represented as integers or real-valued numbers. For example, in a real estate model, you would probably represent the size of a house (in square feet or square meters) as numerical data. Representing a feature as numerical data indicates that the feature's values have a mathematical relationship to each other and possibly to the label. For example, representing the size of a house as numerical data indicates that a 200 square-meter house is twice as large as a 100 square-meter house. Furthermore, the number of square meters in a house probably has some mathematical relationship to the price of the house.

Not all integer data should be represented as numerical data. For example, postal codes in some parts of the world are integers; however, integer postal codes should not be represented as numerical data in models. That's because a postal code of 20000 is not twice (or half) as potent as a postal code of 10000. Furthermore, although different postal codes do correlate to different real estate values, we can't assume that real estate values at postal code 20000 are twice as valuable as real estate values at postal code 10000. Postal codes should be represented as categorical data instead.

Numerical features are sometimes called continuous features.

📖 stoas
⥱ context